Supervised classification in the presence of misclassified training data: a Monte Carlo simulation study in the three group case
Abstract
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study, and classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case are simulated, and results are presented in terms of overall as well as subgroup classification accuracy. Results show that classification accuracy decreases as sample size, group separation, and group size ratio decrease and as misclassification percentage increases, with random forests demonstrating the highest accuracy across conditions.
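The kind of simulation design described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the one-dimensional normal groups, the equal group sizes, the parameter names (`n_per_group`, `separation`, `mislabel_rate`), and the nearest-centroid classifier, which is a simple stand-in and not the random forest or any other method the study evaluated.

```python
import random
import statistics


def simulate_once(n_per_group=100, separation=2.0, mislabel_rate=0.2, rng=None):
    """Run one replication; return (overall accuracy, per-group accuracies)."""
    rng = rng or random.Random()
    groups = [0, 1, 2]

    # Assumed data model: group g is normal with mean g * separation, sd 1.
    def draw(n):
        return [(rng.gauss(g * separation, 1.0), g) for g in groups for _ in range(n)]

    train, test = draw(n_per_group), draw(n_per_group)

    # Corrupt the training labels: with probability mislabel_rate, a case is
    # reassigned to one of the two other groups chosen at random.
    noisy = []
    for x, g in train:
        if rng.random() < mislabel_rate:
            g = rng.choice([h for h in groups if h != g])
        noisy.append((x, g))

    # Stand-in classifier: nearest centroid fitted on the noisy labels.
    centroids = {}
    for g in groups:
        xs = [x for x, label in noisy if label == g]
        if xs:  # skip a group that ended up with no training cases
            centroids[g] = statistics.fmean(xs)

    # Score on a clean test set, overall and by true group.
    correct = {g: 0 for g in groups}
    for x, g in test:
        pred = min(centroids, key=lambda c: abs(x - centroids[c]))
        correct[g] += pred == g
    per_group = [correct[g] / n_per_group for g in groups]
    overall = sum(correct.values()) / (len(groups) * n_per_group)
    return overall, per_group


def monte_carlo(n_reps=200, seed=0, **kwargs):
    """Average overall accuracy across n_reps independent replications."""
    rng = random.Random(seed)
    return statistics.fmean(
        simulate_once(rng=rng, **kwargs)[0] for _ in range(n_reps)
    )
```

Calling `monte_carlo(mislabel_rate=0.0)` and `monte_carlo(mislabel_rate=0.5)` gives a rough feel for how accuracy responds to label noise under these assumptions; varying `n_per_group` and `separation` would correspond to the other factors the abstract mentions.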
Similar resources
Kinetic Monte Carlo Study of Biodiesel Production through Transesterification of Brassica Carinata Oil
In the present study, the kinetics of biodiesel production through transesterification of Brassica carinata oil with methanol in the presence of potassium hydroxide is investigated by kinetic Monte Carlo simulation. The simulation results agree qualitatively with the existing experimental data. The kinetics data for each step of the suggested mechanism are confirmed by simulation. By ...
Applying Point Estimation and Monte Carlo Simulation Methods in Solving Probabilistic Optimal Power Flow Considering Renewable Energy Uncertainties
The increasing penetration of renewable energy is changing the traditional power system planning and operation tools. Because the power generated by renewable energy resources varies probabilistically, deterministic power system analysis tools cannot be applied in this case. Probabilistic optimal power flow is one of the most useful tools regarding the power system analysis in presen...
Population dynamic of Acipenser persicus by Monte Carlo simulation model and Bootstrap method in the southern Caspian Sea (Case study: Guilan province)
In this study, the population dynamics of Acipenser persicus were studied with an age-structured model using Monte Carlo and bootstrap approaches. Length frequency data for a total of 4376 specimens were collected from beach seine, fixed gill net and conservation force in coastal Guilan province during 2002 to 2012. The data were imported into FiSAT II for length frequency analysis by ELEFAN 1. K, L∞ and t0 were estimated as 203, 0.08 and...
A New Approach for Monte Carlo Simulation of RAFT Polymerization
In this work, based on experimental observations and exact theoretical predictions, the kinetic scheme of RAFT polymerization is extended to a wider range of reactions such as irreversible intermediate radical terminations and reversible transfer reactions. The reactions included in the kinetic scheme are the more probable reactions from a theoretical point of view. The ...
Evaluation of glandular dose in mammography in the presence of breast cyst using Monte Carlo simulation
Introduction: Average glandular dose (AGD), entrance skin air kerma (ESAK) and normalized glandular dose (DgN) are the main dosimetric quantities in mammography. In this study, DgN is evaluated in the presence of breast cysts, which are common among women, and the influence of the size, number and location of the cysts on DgN is investigated. Materials and Meth...